Abstract

Block diagonalization (BD) is a practical linear precoding technique that eliminates the inter-user interference 
in downlink multiuser multiple-input multiple-output (MIMO) systems. In this paper, we apply BD to the downlink 
transmission in a cooperative multi-cell MIMO system, where the signals from different base stations (BSs) to all 
the mobile stations (MSs) are jointly designed with the perfect knowledge of the downlink channels and transmit 
messages. Specifically, we study the optimal BD precoder design to maximize the weighted sum-rate of all the MSs 
subject to a set of per-BS power constraints. This design problem is formulated in an auxiliary MIMO broadcast 
channel (BC) with a set of transmit power constraints corresponding to those for individual BSs in the
multi-cell system. By applying convex optimization techniques, this paper develops an efficient algorithm to solve this
problem, and derives the closed-form expression for the optimal BD precoding matrix. It is revealed that the optimal 
BD precoding vectors for each MS in the per-BS power constraint case are in general non-orthogonal, which differs 
from the conventional orthogonal BD precoder design for the MIMO-BC under one single sum-power constraint. 
Moreover, for the special case of single-antenna BSs and MSs, the proposed solution reduces to the optimal 
zero-forcing beamforming (ZF-BF) precoder design for the weighted sum-rate maximization in the multiple-input 
single-output (MISO) BC with per-antenna power constraints. Suboptimal and low-complexity BD/ZF-BF precoding 
schemes are also presented, and their achievable rates are compared against those with the optimal schemes. 

Index Terms 

Block diagonalization, convex optimization, cooperative multi-cell system, multi-antenna broadcast channel, 
network MIMO, per-antenna power constraint, per-base-station power constraint, zero-forcing beamforming. 

I. Introduction 

The study of downlink beamforming and power control in cellular systems has been an active area of research
for many years. Conventionally, most of the related works have focused on a single-cell setup, where the co-channel
interference experienced by the mobile stations (MSs) in a particular cell, caused by the base stations (BSs) of the
other cells, is treated as additional noise at the receivers. For this setup, the downlink transmission in a single
cell with a multi-antenna BS and multiple single-/multi-antenna MSs can be modeled by a multiple-input
single-/multiple-output (MISO/MIMO) broadcast channel (BC). It is known that the dirty paper coding (DPC) technique
achieves the capacity region for the Gaussian MISO/MIMO BC, which constitutes all the simultaneously achievable
rates for all the MSs [1]. However, DPC requires complicated nonlinear encoding and decoding schemes and is
thus difficult to implement in real-time systems. Consequently, linear transmit and receive beamforming schemes
for the Gaussian MISO/MIMO BC have drawn a great deal of attention in the literature [2], [3], [4], [5], [6].
In particular, a simple linear precoding scheme for the MIMO BC is known as block diagonalization (BD) [7],
[8], [9], [10]. With BD, the transmitted signal from the BS intended for each MS is multiplied by a precoding
matrix, which is restricted to be orthogonal to the downlink channels associated with all the other MSs. Thereby,
all the inter-user interference is eliminated and each MS perceives an interference-free MIMO channel. In the
special case of MISO BC, BD reduces to the well-known zero-forcing beamforming (ZF-BF) [4]. Although BD is
in general inferior in terms of achievable rate as compared to the DPC-based optimal nonlinear precoding scheme
or the minimum-mean-squared-error (MMSE)-based optimal linear precoding scheme, it performs very well in the
high signal-to-noise-ratio (SNR) regime and achieves the same degrees of freedom (DoF) for the MISO/MIMO-BC
sum-rate as the optimal linear/nonlinear precoding schemes [11]. Moreover, BD can be generalized to incorporate
nonlinear DPC processing, which leads to a precoding scheme known as ZF-DPC [11].

Recently, there has been a rapidly growing interest in shifting the design paradigm from the conventional
single-cell downlink transmission to the multi-cell cooperative downlink transmission [12], [13], [14], [15], [16], [17]. In
these studies, it is assumed that BSs in a cellular network are connected via backhaul links to a central processing
unit (e.g., a dedicated control station or a preassigned BS), which has the global knowledge of transmit messages for
all the MSs in the network and downlink channels from each BS to all the MSs. Thereby, the central processing unit
is able to jointly design the downlink transmissions for all BSs and provide them with appropriate signals to transmit. As
demonstrated in these works, by utilizing the co-channel interference across different cells in a coherent fashion,
the cooperative multi-cell downlink processing leads to enormous throughput gains as compared to the conventional
single-cell processing with the co-channel interference treated as noise. Moreover, the design of distributed multi-cell
downlink beamforming via belief propagation and message passing among BSs, without the need of a central
controller, has also been recently proposed in [18].

In this work, we focus our study on the BD-based downlink precoding for a fully cooperative multi-cell system
equipped with a central processing unit, which is assumed to have the perfect knowledge of all downlink channels
and transmit messages in the network. For this setup, the BD precoding design problem can be formulated
in an auxiliary MISO/MIMO BC with the number of transmitting antennas equal to the sum of those from
all the cooperative BSs. However, instead of adopting the conventional sum-power constraint for the auxiliary
MISO/MIMO BC as in prior works [7], [8], [9], [10], this paper applies a set of transmit power constraints
equivalent to those for individual BSs in the multi-cell system. The BD precoder design problem subject to per-BS
power constraints is relatively new, and has been studied in, e.g., [19], [20], [21]. In these works, the BD precoders
are designed essentially following the same principle as for the conventional sum-power constraint case, i.e., the
precoding vectors known for the sum-power constraint case are adopted, and then the power allocation is optimized
to maximize the sum-rate under per-BS power constraints. However, it remains unclear whether the developed BD
precoder solutions therein are indeed optimal for the weighted sum-rate maximization in a cooperative multi-cell
system. In this paper, we show that the BD precoder designs following the heuristic of separating the beamforming
design and power allocation optimization are indeed suboptimal for rate maximization, while the optimal BD
precoder solution requires a new joint optimization approach, as will be proposed in this paper.

It is worth noting that the problem of computing the achievable rate region of the Gaussian MISO/MIMO BC
subject to per-antenna power constraints has been studied in [22]. This work has been recently extended in [23],
[24] to deal with more general linear transmit power constraints for the MISO/MIMO BC, with the per-antenna
power constraint as a special case. The results in [22], [23], [24] can be directly applied to a cooperative multi-cell
system to handle the per-BS power constraints, if the DPC-based optimal nonlinear precoder or the MMSE-based
optimal linear precoder is used. On the other hand, the ZF-BF precoding design, as a simplified version of BD for
the case of MISO BC, has been studied in [19], [24], [25] subject to per-antenna power constraints. In [19], the
ZF-BF precoding matrix is taken as the pseudo inverse of the MISO-BC channel matrix and thereby decomposes the
MISO BC into parallel interference-free scalar sub-channels for different MSs. The power allocation over the
sub-channels is then optimized under per-antenna power constraints. However, it was pointed out in [25] that although
the ZF-BF precoding matrix for the MISO BC based on the channel pseudo inverse is optimal for the sum-power
constraint case, it is in general suboptimal for the per-antenna power constraint case. Thus, in [25] the authors
proposed to apply the principle of generalized matrix inverse to design the ZF-BF precoding with per-antenna
power constraints. The scheme in [25] has been improved in terms of computational efficiency and extended to
the case of general linear power constraints in [24]. However, these MISO-BC ZF-BF solutions cannot be applied
to obtain the optimal BD precoder design for the more general MIMO BC with per-BS power constraints.

The main contributions of this paper are summarized as follows: 

• We formulate the MIMO-BC transmit optimization problem with the BD precoding and equivalent per-BS 
power constraints as a convex optimization problem. By applying convex optimization techniques, we design 
an efficient algorithm to solve this problem. We also derive the closed-form expression of the optimal BD 
precoding matrix to maximize the weighted sum-rate for the MIMO-BC, from which we obtain a lower bound
on the number of BSs that should transmit with their maximum power levels. More importantly, we prove that
the optimal BD precoding (beamforming) vectors for each MS in the case of per-BS power constraints are
in general non-orthogonal, which differs from the conventional orthogonal BD precoder design for the
sum-power constraint case. Consequently, the orthogonal BD precoder designs proposed in prior works [19], [20],
[21] for the per-BS power constraint case are in general suboptimal (for weighted sum-rate maximization).

• For the special case of single-antenna BSs and MSs, we show that the proposed BD precoding design for the 
MIMO-BC provides the optimal ZF-BF precoder solution to maximize the weighted sum-rate for the MISO 
BC with per-antenna power constraints. We also compare the proposed solution with existing ones in prior
works [24], [25] based on approaches such as the generalized channel matrix inverse and semi-definite
programming (SDP) with rank-one relaxation.

• We also present a low-complexity, suboptimal scheme for the studied problem, which is obtained by computing 
the conventional BD precoder design for the sum-power constraint case with an optimal power allocation to 
meet the per-BS power constraints. This scheme can be considered as an extension of that given in [19]
for the MISO BC with the ZF-BF precoding and per-antenna power constraints to the MIMO BC with the 
BD precoding and per-BS power constraints. We derive an upper bound on the maximum number of BSs 
transmitting with their full power levels for this scheme, and identify the conditions under which this scheme 
becomes sum-rate optimal. 

The rest of this paper is organized as follows. Section II introduces the signal model for the downlink transmission
in a cooperative multi-cell system, and presents the problem formulation for the weighted sum-rate maximization
with the BD precoding and per-BS power constraints. Section III derives the optimal solution for this problem, and
characterizes the optimal solution for the special case of MISO BC with per-antenna power constraints. Section
IV develops a heuristic suboptimal scheme for the studied problem. Section V provides numerical examples on
the performance of the proposed optimal and suboptimal schemes. Finally, Section VI concludes the paper.

Notations: Scalars are denoted by lower-case letters, vectors by bold-face lower-case letters, and matrices
by bold-face upper-case letters. $\mathbf{I}$ and $\mathbf{0}$ denote an identity matrix and an all-zero matrix, respectively, with
appropriate dimensions. For a square matrix $\mathbf{S}$, $\mathrm{Tr}(\mathbf{S})$, $|\mathbf{S}|$, $\mathbf{S}^{-1}$, and $\mathbf{S}^{1/2}$ denote the trace, determinant, inverse
(if $\mathbf{S}$ is full-rank), and square root of $\mathbf{S}$, respectively; and $\mathbf{S} \succeq \mathbf{0}$ ($\mathbf{S} \preceq \mathbf{0}$) means that $\mathbf{S}$ is positive (negative)
semi-definite. $\mathrm{Diag}(\mathbf{a})$ denotes a diagonal matrix with the main diagonal given by $\mathbf{a}$. For a matrix $\mathbf{M}$ of arbitrary
size, $\mathbf{M}^H$, $\mathbf{M}^T$, $\mathrm{Rank}(\mathbf{M})$, and $\mathbf{M}^{\dagger}$ denote the conjugate transpose, transpose, rank, and pseudo inverse of $\mathbf{M}$,
respectively. $\mathbb{E}[\cdot]$ denotes the statistical expectation. The distribution of a circularly symmetric complex Gaussian
(CSCG) random vector with mean vector $\mathbf{x}$ and covariance matrix $\boldsymbol{\Sigma}$ is denoted by $\mathcal{CN}(\mathbf{x}, \boldsymbol{\Sigma})$; and $\sim$ stands for
"distributed as". $\mathbb{C}^{x \times y}$ denotes the space of $x \times y$ complex matrices. $\|\mathbf{x}\|$ denotes the Euclidean norm of a complex
vector $\mathbf{x}$.

II. System Model and Problem Formulation 

We consider a multi-cell system consisting of $A$ cells, each of which has a BS to coordinate the transmission
with $K_a$ MSs, $K_a \geq 1$ and $a = 1, \cdots, A$. Denote the total number of MSs in the system as $K = \sum_{a=1}^{A} K_a$. For
convenience, we assume that all the BSs are equipped with the same number of antennas, denoted by $M_B \geq 1$.
Denote the total number of antennas across all the BSs as $M = M_B A$. We also assume that each of the $K$ MSs is
equipped with $N$ antennas, $N \geq 1$. Since we are interested in a fully cooperative multi-cell system, the jointly
designed downlink transmission for all the BSs can be conveniently modeled by an auxiliary MIMO BC with $M$
transmitting antennas and $K$ MSs each having $N$ receiving antennas. For convenience, we assign the indices to
the transmitting antennas in the auxiliary MIMO BC belonging to different BSs in the multi-cell system according
to the BS index, i.e., the $((a-1)M_B + 1)$-th to $(aM_B)$-th antennas are taken as the $M_B$ antennas from the $a$th
BS, $a = 1, \cdots, A$. Similarly, the indices of MSs in the auxiliary MIMO BC are assigned according to their cell
indices, i.e., the $(\sum_{i=1}^{a-1} K_i + 1)$-th to $(\sum_{i=1}^{a} K_i)$-th MSs are taken as the $K_a$ MSs from the $a$th cell, $a = 1, \cdots, A$.
Accordingly, the discrete-time baseband signal of the auxiliary MIMO BC is given by

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{x}_k + \sum_{j \neq k} \mathbf{H}_k \mathbf{x}_j + \mathbf{z}_k, \quad k = 1, \cdots, K \tag{1}$$

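where $\mathbf{x}_k \in \mathbb{C}^{M \times 1}$ and $\mathbf{y}_k \in \mathbb{C}^{N \times 1}$ denote the transmitted and received signals for the $k$th MS, respectively;
$\mathbf{H}_k \in \mathbb{C}^{N \times M}$ denotes the downlink channel from all the $M$ base-station antennas to the $k$th MS; and $\mathbf{z}_k \in \mathbb{C}^{N \times 1}$
denotes the receiver noise at the $k$th MS. For convenience, we assume that $\mathbf{z}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, $\forall k$.
Without loss of generality, we can further express $\mathbf{x}_k$ as

$$\mathbf{x}_k = \mathbf{T}_k \mathbf{s}_k, \quad k = 1, \cdots, K \tag{2}$$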

where $\mathbf{T}_k \in \mathbb{C}^{M \times D_k}$ is the precoding matrix (which specifies both the transmit beamforming vectors and allocated
power values for different beams) for the $k$th MS; $D_k$ denotes the number of transmitted data streams for the $k$th MS
due to spatial multiplexing, with $D_k \leq \min(M, N)$, $\forall k$; and $\mathbf{s}_k \in \mathbb{C}^{D_k \times 1}$ denotes the information-bearing signal
for the $k$th MS. We assume that $\mathbf{s}_k$'s are independent over $k$. It is further assumed that a Gaussian codebook is used
for each MS at the transmitter and thus $\mathbf{s}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, $\forall k$. Denote $\mathbf{S}_k = \mathbb{E}[\mathbf{x}_k \mathbf{x}_k^H]$ as the transmit covariance
matrix for the $k$th MS, with $\mathbf{S}_k \in \mathbb{C}^{M \times M}$ and $\mathbf{S}_k \succeq \mathbf{0}$. It is easy to verify that $\mathbf{S}_k = \mathbf{T}_k \mathbf{T}_k^H$. The overall
downlink transmit covariance matrix for the $M$ cooperative transmitting antennas is then $\mathbf{S} = \sum_{k=1}^{K} \mathbf{S}_k$. Since
these transmitting antennas come from more than one BS, they need to satisfy a set of per-BS power constraints
expressed as

$$\mathrm{Tr}(\mathbf{B}_a \mathbf{S}) \leq P \quad \text{or} \quad \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}_a \mathbf{S}_k) \leq P, \quad a = 1, \cdots, A \tag{3}$$

where

$$\mathbf{B}_a \triangleq \mathrm{Diag}\big(\underbrace{0, \cdots, 0}_{(a-1)M_B}, \underbrace{1, \cdots, 1}_{M_B}, \underbrace{0, \cdots, 0}_{(A-a)M_B}\big) \tag{4}$$

and $P$ denotes the per-BS power constraint, which is assumed identical for all the BSs. Note that in the special case
of single-antenna BSs and MSs, i.e., $M_B = N = 1$, the per-BS power constraints in (3) reduce to the per-antenna
power constraints for an equivalent MISO BC.
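
As a small illustration of the per-BS constraints in (3) and the selector matrices in (4), the following sketch builds the selector for each BS and evaluates the per-BS transmit powers of a given covariance matrix. The function names and the toy numbers are assumptions made for this example only; they are not part of the paper.

```python
import numpy as np

def per_bs_selector(a, A, M_B):
    """Diagonal selector B_a of (4): ones on the M_B antennas of BS a (1-indexed)."""
    d = np.zeros(A * M_B)
    d[(a - 1) * M_B : a * M_B] = 1.0
    return np.diag(d)

def per_bs_powers(S, A, M_B):
    """Return [Tr(B_a S) for a = 1..A], the power radiated by each BS."""
    return np.array([np.trace(per_bs_selector(a, A, M_B) @ S).real
                     for a in range(1, A + 1)])

# Hypothetical toy values: A = 2 BSs with M_B = 2 antennas each (M = 4).
A, M_B, P = 2, 2, 2.0
S = np.diag([0.5, 0.25, 1.0, 0.75]).astype(complex)
powers = per_bs_powers(S, A, M_B)   # BS 1 owns antennas 1-2, BS 2 owns antennas 3-4
feasible = powers <= P              # elementwise check of the constraints in (3)
```

Because the selectors partition the antennas, the per-BS powers always add up to the total transmit power $\mathrm{Tr}(\mathbf{S})$.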

We assume a quasi-static fading environment and thus the channels of interest in the auxiliary MIMO-BC remain
constant for each downlink transmission frame. We consider the BD precoding scheme for each downlink frame
transmission in the MIMO BC, which eliminates the inter-user interference, i.e., in (1) we have that for each given
$k$, $\mathbf{H}_j \mathbf{x}_k = \mathbf{0}$ or $\mathbf{H}_j \mathbf{T}_k = \mathbf{0}$, $\forall j \neq k$. It is easy to show that the above "ZF constraints" are equivalent to the
following constraints

$$\mathbf{H}_j \mathbf{S}_k \mathbf{H}_j^H = \mathbf{0}, \quad \forall j \neq k. \tag{5}$$

Assuming that the row vectors in all $\mathbf{H}_k$'s are linearly independent (due to independent fading), from the constraints
in (5) with $k = 1, \cdots, K$, it follows that $NK \leq M$ needs to be true in order to have a set of feasible $\mathbf{S}_k$'s with
$D_k = \mathrm{Rank}(\mathbf{S}_k) = N$, $\forall k$, i.e., all the MSs have the same number of data streams equal to $N$. In practice, the total
number of MSs in the system can be very large such that the above condition is not satisfied. In such scenarios,
the transmissions to MSs can be scheduled into different time-slots or frequency-bands by the central processing
unit, where in each time-slot/frequency-band, the number of MSs scheduled for transmission satisfies the above
condition. The interested readers may refer to [26], [27] for the detailed design of downlink transmission scheduling
in the MISO/MIMO BC with ZF-BF/BD precoding. For the rest of this paper, we assume that $D_k = N$, $\forall k$, and
$NK \leq M$ for simplicity.

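We are now ready to present the weighted sum-rate maximization problem for the downlink transmission in a
cooperative multi-cell system with the BD precoding and per-BS power constraints as follows.

$$
\begin{aligned}
(\mathrm{P1}): \quad \max_{\mathbf{S}_1, \cdots, \mathbf{S}_K} \quad & \sum_{k=1}^{K} w_k \log \left| \mathbf{I} + \mathbf{H}_k \mathbf{S}_k \mathbf{H}_k^H \right| \\
\mathrm{s.t.} \quad & \mathbf{H}_j \mathbf{S}_k \mathbf{H}_j^H = \mathbf{0}, \quad \forall j \neq k \\
& \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}_a \mathbf{S}_k) \leq P, \quad \forall a \\
& \mathbf{S}_k \succeq \mathbf{0}, \quad \forall k
\end{aligned}
$$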



where $w_k$ is the given non-negative rate weight for the $k$th MS. For the purpose of exposition, we assume that
$w_k > 0$, $\forall k$. Note that in (P1), we have used transmit covariance matrices $\mathbf{S}_k$'s instead of precoding matrices $\mathbf{T}_k$'s
as design variables. This is because with $\mathbf{S}_k$'s, it is easy to verify that (P1) is a convex optimization problem,
since the objective function is concave over $\mathbf{S}_k$'s and all the constraints specify a convex set over $\mathbf{S}_k$'s. Thus, (P1)
can be solved using standard convex optimization techniques, e.g., the interior-point method [28]. However, such
an approach does not reveal the structure of the optimal BD precoding solution. Therefore, in this paper we take
a different approach to solve (P1), which is based on the Lagrange duality method [28] for convex optimization
problems. As will be shown in Section III, this approach leads to a closed-form expression for the optimal BD
precoding matrix, and reveals some interesting properties of the optimal solution.

Remark 2.1: It is worth noting that (P1) can be modified to incorporate additional per-antenna power constraints
at all the BSs. Let $P^{(\mathrm{pa})}$ denote the given per-antenna power threshold. Then, a set of $M$ per-antenna power
constraints can be included in (P1) as follows:

$$\sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}'_i \mathbf{S}_k) \leq P^{(\mathrm{pa})}, \quad i = 1, \cdots, M \tag{6}$$

where $\mathbf{B}'_i$ is a diagonal matrix with the $i$th diagonal element equal to one and all the others equal to zero. Since
the resulting optimization problem has similar structure to (P1), it can be solved in a similar way. In this paper,
we omit the details for solving this modified version of (P1) for brevity.

Remark 2.2: It is also worth noting that (P1) can be modified to solve the weighted sum-rate maximization
problem for the cooperative multi-cell downlink transmission with the ZF-DPC precoding [11] subject to the new
per-BS/per-antenna power constraints. With ZF-DPC, given a fixed encoding order for the transmitted signals to
different MSs (without loss of generality, we assume that the encoding order is given by the MS index), the signal
for a later encoded MS is designed with the non-causal knowledge of all the earlier encoded MS signals, of which
the associated interference can be precanceled by DPC. By extending the ZF-DPC scheme in [11] for the MISO
BC to the case of MIMO BC, (P1) can be modified to obtain the optimal ZF-DPC precoder design subject to
per-BS power constraints by rewriting the set of ZF constraints in (P1) as

$$\mathbf{H}_j \mathbf{S}_k \mathbf{H}_j^H = \mathbf{0}, \quad \forall j > k. \tag{7}$$

The resulting problem has similar structure to (P1) and can be solved similarly (the details are omitted for brevity).

III. Proposed Solution 

In this section, we first present a new algorithm to solve (P1), which reveals the structure of the optimal BD
precoding matrix for the general case with arbitrary numbers of antennas at the BS or MS. Then, we investigate
the developed solution for the special case of single-antenna BSs and MSs, and compare it with other existing
solutions in the literature.

A. General Case 

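To solve (P1), it is desirable to remove the set of ZF constraints, as follows: Define
$\mathbf{G}_k = [\mathbf{H}_1^T, \cdots, \mathbf{H}_{k-1}^T, \mathbf{H}_{k+1}^T, \cdots, \mathbf{H}_K^T]^T$, $k = 1, \cdots, K$, where $\mathbf{G}_k \in \mathbb{C}^{L \times M}$ with $L = N(K-1)$. Let the (reduced) singular value
decomposition (SVD) of $\mathbf{G}_k$ be denoted as $\mathbf{G}_k = \mathbf{U}_k \boldsymbol{\Sigma}_k \mathbf{V}_k^H$, where $\mathbf{U}_k \in \mathbb{C}^{L \times L}$ with $\mathbf{U}_k^H \mathbf{U}_k = \mathbf{U}_k \mathbf{U}_k^H = \mathbf{I}$,
$\mathbf{V}_k \in \mathbb{C}^{M \times L}$ with $\mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}$, and $\boldsymbol{\Sigma}_k$ is an $L \times L$ positive diagonal matrix. Note that $\mathrm{Rank}(\mathbf{G}_k) = L < M$
under the previous assumption that $NK \leq M$. Define the projection matrix $\mathbf{P}_k = \mathbf{I} - \mathbf{V}_k \mathbf{V}_k^H$. Without loss of
generality, we can express $\mathbf{P}_k = \tilde{\mathbf{V}}_k \tilde{\mathbf{V}}_k^H$, where $\tilde{\mathbf{V}}_k \in \mathbb{C}^{M \times (M-L)}$ satisfies $\mathbf{V}_k^H \tilde{\mathbf{V}}_k = \mathbf{0}$ and $\tilde{\mathbf{V}}_k^H \tilde{\mathbf{V}}_k = \mathbf{I}$. Note
that $[\mathbf{V}_k, \tilde{\mathbf{V}}_k]$ forms an $M \times M$ unitary matrix. Then, we have the following lemma.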

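Lemma 3.1: The optimal solution of (P1) is given by

$$\mathbf{S}_k = \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H, \quad k = 1, \cdots, K \tag{8}$$

where $\mathbf{Q}_k \in \mathbb{C}^{(M-L) \times (M-L)}$ and $\mathbf{Q}_k \succeq \mathbf{0}$.

Proof: Please refer to Appendix A. ■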
Remark 3.1: In prior works [7], [8], [9], [10] on the design of BD precoder for the MIMO BC with the
sum-power constraint, it has been observed that the columns (precoding vectors) in the BD precoding matrix for the
$k$th MS, $\mathbf{T}_k$, with $\mathbf{T}_k \mathbf{T}_k^H = \mathbf{S}_k$, should be linear combinations of those in $\tilde{\mathbf{V}}_k$ in order to satisfy the constraints:
$\mathbf{H}_j \mathbf{T}_k = \mathbf{0}$, $\forall j \neq k$. Lemma 3.1 extends this result to the case of per-BS power constraints.
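
The structure in Lemma 3.1 can be checked numerically. The sketch below is an illustration under assumed toy dimensions and randomly drawn channels, not the paper's implementation: it takes the last $M - L$ right singular vectors of $\mathbf{G}_k$ as an orthonormal null-space basis and confirms that any covariance matrix built from that basis causes no interference at the other MS. The helper name `null_space_basis` is hypothetical.

```python
import numpy as np

def null_space_basis(H_list, k):
    """Orthonormal basis (M x (M - L)) of the null space of G_k.

    G_k stacks the channels of all MSs except MS k; it is assumed to have
    full row rank L, which holds almost surely for random channels.
    """
    G_k = np.vstack([H for j, H in enumerate(H_list) if j != k])
    L = G_k.shape[0]
    # Full SVD: the last M - L rows of Vh span the null space of G_k.
    _, _, Vh = np.linalg.svd(G_k, full_matrices=True)
    return Vh[L:].conj().T

rng = np.random.default_rng(0)
K, N, M = 2, 2, 6                      # hypothetical sizes with N*K <= M
H_list = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
          for _ in range(K)]

V_null = null_space_basis(H_list, 0)   # basis for MS 1 (index 0)
Q = np.eye(M - N * (K - 1))            # any PSD matrix of size (M - L) works here
S_0 = V_null @ Q @ V_null.conj().T     # covariance with the structure of (8)
leak = np.linalg.norm(H_list[1] @ S_0 @ H_list[1].conj().T)   # ZF residual, cf. (5)
```

The residual `leak` is zero up to floating-point error, which is the ZF constraint (5) for this pair of MSs.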



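With the optimal structures for $\mathbf{S}_k$'s given in Lemma 3.1, it can be verified that all the ZF constraints in (P1)
are satisfied and thus can be removed. Thus, (P1) reduces to the following equivalent problem

$$
\begin{aligned}
(\mathrm{P2}): \quad \max_{\mathbf{Q}_1, \cdots, \mathbf{Q}_K} \quad & \sum_{k=1}^{K} w_k \log \left| \mathbf{I} + \mathbf{H}_k \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H \mathbf{H}_k^H \right| \\
\mathrm{s.t.} \quad & \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}_a \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H) \leq P, \quad \forall a \\
& \mathbf{Q}_k \succeq \mathbf{0}, \quad \forall k.
\end{aligned}
$$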

Similar to (P1), it can be shown that (P2) is convex. Thus, (P2) is solvable by the Lagrange duality method as
follows. By introducing a set of non-negative dual variables, $\mu_a$, $a = 1, \cdots, A$, associated with the set of per-BS
power constraints in (P2), the Lagrangian function of (P2) can be written as

$$L(\{\mathbf{Q}_k\}, \{\mu_a\}) = \sum_{k=1}^{K} w_k \log \left| \mathbf{I} + \mathbf{H}_k \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H \mathbf{H}_k^H \right| - \sum_{a=1}^{A} \mu_a \left( \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}_a \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H) - P \right) \tag{9}$$

where $\{\mathbf{Q}_k\}$ and $\{\mu_a\}$ denote the set of $\mathbf{Q}_k$'s and the set of $\mu_a$'s, respectively. The Lagrange dual function for
(P2) is then defined as

$$g(\{\mu_a\}) = \max_{\mathbf{Q}_k \succeq \mathbf{0}, \forall k} L(\{\mathbf{Q}_k\}, \{\mu_a\}). \tag{10}$$

Moreover, the dual problem of (P2) is defined as

$$(\mathrm{P2\text{-}D}): \quad \min_{\mu_a \geq 0, \forall a} \; g(\{\mu_a\}).$$

Since (P2) is convex and satisfies Slater's condition [28], the duality gap between the optimal objective value
of (P2) and that of (P2-D) is zero. Thus, (P2) can be solved equivalently by solving (P2-D). Moreover, (P2-D)
is convex and can be solved by a subgradient-based method, e.g., the ellipsoid method [29], given the fact that
the subgradient of function $g(\{\mu_a\})$ at a set of fixed $\mu_a$'s is $P - \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}_a \tilde{\mathbf{V}}_k \mathbf{Q}_k^\star \tilde{\mathbf{V}}_k^H)$ for $\mu_a$, $a = 1, \cdots, A$,
where $\{\mathbf{Q}_k^\star\}$ is the optimal solution for the maximization problem in (10) with the given set of $\mu_a$'s.

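Next, we focus on solving for $\{\mathbf{Q}_k^\star\}$ with a set of fixed $\mu_a$'s. From (9), it is observed that the maximization
problem in (10) can be separated into $K$ independent subproblems each involving only one $\mathbf{Q}_k$. By discarding the
irrelevant terms, the corresponding subproblem, for a given $k$, can be expressed as

$$(\mathrm{P3}): \quad \max_{\mathbf{Q}_k \succeq \mathbf{0}} \; w_k \log \left| \mathbf{I} + \mathbf{H}_k \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H \mathbf{H}_k^H \right| - \mathrm{Tr}(\mathbf{B}_\mu \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H)$$

where $\mathbf{B}_\mu \triangleq \sum_{a=1}^{A} \mu_a \mathbf{B}_a$. Note that $\mathbf{B}_\mu$ is a diagonal matrix with the diagonal elements given by different $\mu_a$'s
in the order of $a = 1, \cdots, A$. We then have the following lemma.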

Lemma 3.2: Let $A_\mu$ denote the number of $\mu_a$'s in the main diagonal of $\mathbf{B}_\mu$, $a \in \{1, \cdots, A\}$, with $\mu_a > 0$.
Then, for (P3) to have a bounded objective value, it holds that $A_\mu \geq \left\lceil \frac{M - N(K-1)}{M_B} \right\rceil$.

Proof: Please refer to Appendix B. ■
Remark 3.2: It is noted that by applying the Karush-Kuhn-Tucker (KKT) conditions [28] to (P2), the fact that
$\mu_a > 0$ for a given $a \in \{1, \cdots, A\}$ implies that the corresponding power constraint must be tight with the optimal
solution for $\{\mathbf{Q}_k\}$. Accordingly, in (P1) the optimal downlink transmit covariance matrices $\mathbf{S}_k^\star$'s must make the
$a$th per-BS power constraint tight. Therefore, Lemma 3.2 provides a lower bound on the number of BSs for which
the corresponding transmit power constraints must be tight with the optimal $\mathbf{S}_k^\star$'s for (P1).

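With Lemma 3.2 and $L = N(K-1)$, we can assume without loss of generality that $M_B A_\mu \geq (M - L)$ since we
are only interested in the case where the objective value of (P3) and that of (P1) are both bounded. Accordingly,
we have $\mathrm{Rank}(\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k) = \min(M_B A_\mu, M - L) = M - L$. Thus, $\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k \in \mathbb{C}^{(M-L) \times (M-L)}$ is a
full-rank matrix and its inverse exists. Moreover, since $\mathrm{Tr}(\mathbf{X}\mathbf{Y}) = \mathrm{Tr}(\mathbf{Y}\mathbf{X})$, in (P3) we have
$\mathrm{Tr}(\mathbf{B}_\mu \tilde{\mathbf{V}}_k \mathbf{Q}_k \tilde{\mathbf{V}}_k^H) = \mathrm{Tr}((\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{1/2} \mathbf{Q}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{1/2})$. We thus define

$$\tilde{\mathbf{Q}}_k = (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{1/2} \mathbf{Q}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{1/2}. \tag{11}$$

Then, (P3) can be reformulated to maximize

$$w_k \log \left| \mathbf{I} + \mathbf{H}_k \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2} \tilde{\mathbf{Q}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2} \tilde{\mathbf{V}}_k^H \mathbf{H}_k^H \right| - \mathrm{Tr}(\tilde{\mathbf{Q}}_k) \tag{12}$$

subject to $\tilde{\mathbf{Q}}_k \succeq \mathbf{0}$. Note that $\mathrm{Rank}(\mathbf{H}_k \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2}) = \min(N, M - L) = N$. Thus, the following
(reduced) SVD can be obtained as

$$\mathbf{H}_k \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2} = \hat{\mathbf{U}}_k \hat{\boldsymbol{\Sigma}}_k \hat{\mathbf{V}}_k^H \tag{13}$$

where $\hat{\mathbf{U}}_k \in \mathbb{C}^{N \times N}$, $\hat{\mathbf{V}}_k \in \mathbb{C}^{(M-L) \times N}$, and $\hat{\boldsymbol{\Sigma}}_k = \mathrm{Diag}(\hat{\sigma}_{k,1}, \cdots, \hat{\sigma}_{k,N})$. Substituting the above SVD into
(12) and applying Hadamard's inequality (see, e.g., [30]) yields the following optimal solution for (12):
$\tilde{\mathbf{Q}}_k^\star = \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k \hat{\mathbf{V}}_k^H$, where $\boldsymbol{\Lambda}_k = \mathrm{Diag}(\lambda_{k,1}, \cdots, \lambda_{k,N})$, and $\lambda_{k,i}$, $i = 1, \cdots, N$, can be obtained by the standard
water-filling algorithm [30] as

$$\lambda_{k,i} = \left( w_k - \frac{1}{\hat{\sigma}_{k,i}^2} \right)^+, \quad i = 1, \cdots, N \tag{14}$$

where $(x)^+ = \max(0, x)$. To summarize, the optimal solution of (P3) for a given set of $\mu_a$'s can be expressed as

$$\mathbf{Q}_k^\star = (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2} \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k \hat{\mathbf{V}}_k^H (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2}, \quad k = 1, \cdots, K. \tag{15}$$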
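
The closed-form steps leading to (15) can be sketched in a few lines of NumPy. This is a minimal illustration, assuming all dual variables are strictly positive (so the inverse square root exists), natural logarithms, and randomly drawn toy channels; the function names and the values of the dual variables are assumptions of the example.

```python
import numpy as np

def null_basis(H_list, k):
    """Orthonormal basis of the null space of G_k (stacked channels of all MSs but k)."""
    G = np.vstack([H for j, H in enumerate(H_list) if j != k])
    return np.linalg.svd(G, full_matrices=True)[2][G.shape[0]:].conj().T

def inv_sqrt(X):
    """Inverse square root of a Hermitian positive definite matrix."""
    vals, vecs = np.linalg.eigh(X)
    return (vecs / np.sqrt(vals)) @ vecs.conj().T

def solve_p3(H_k, V_null, b_mu, w_k):
    """Closed-form maximizer of (P3); b_mu is the diagonal of B_mu (all entries > 0)."""
    C = inv_sqrt(V_null.conj().T @ (b_mu[:, None] * V_null))   # (V^H B_mu V)^(-1/2)
    _, sig, Vh = np.linalg.svd(H_k @ V_null @ C, full_matrices=False)  # reduced SVD (13)
    lam = np.maximum(0.0, w_k - 1.0 / sig**2)                  # water-filling (14)
    V_hat = Vh.conj().T
    return C @ (V_hat * lam) @ V_hat.conj().T @ C              # solution (15)

def p3_objective(Q, H_k, V_null, b_mu, w_k):
    """Objective of (P3) at a given Q, using the natural logarithm."""
    S = V_null @ Q @ V_null.conj().T
    _, logdet = np.linalg.slogdet(np.eye(H_k.shape[0]) + H_k @ S @ H_k.conj().T)
    return w_k * logdet - np.sum(b_mu * np.diag(S).real)

rng = np.random.default_rng(0)
K, N, M_B, A = 2, 2, 2, 3                    # hypothetical sizes, M = 6
M = M_B * A
H_list = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
          for _ in range(K)]
b_mu = np.repeat([0.5, 1.0, 2.0], M_B)       # assumed dual variables, one per BS
V_null = null_basis(H_list, 0)
Q_opt = solve_p3(H_list[0], V_null, b_mu, w_k=1.0)
best = p3_objective(Q_opt, H_list[0], V_null, b_mu, 1.0)
```

Since (P3) is concave in $\mathbf{Q}_k$, the value `best` should not be exceeded by the objective at any other positive semi-definite candidate, which gives a simple sanity check of the closed form.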

Note that when the optimal solution for $\{\mu_a\}$ in (P2-D) is obtained, the corresponding solution in (15) becomes
optimal for (P2). By combining this result with Lemma 3.1, we obtain the following theorem.

Theorem 3.1: The optimal solution of (P1) is given by

$$\mathbf{S}_k^\star = \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1/2} \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k \hat{\mathbf{V}}_k^H (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1/2} \tilde{\mathbf{V}}_k^H, \quad k = 1, \cdots, K \tag{16}$$

where $\mathbf{B}_\mu^\star = \sum_{a=1}^{A} \mu_a^\star \mathbf{B}_a$, with $\mu_a^\star$'s being the optimal dual solutions of (P2).
The algorithm for solving (P1) is summarized as follows.
Algorithm (A1):

• Initialize $\mu_a \geq 0$, $a = 1, \cdots, A$.

• Repeat

1. Solve $\mathbf{Q}_k^\star$, $k = 1, \cdots, K$, using (15) with the given $\mu_a$'s;

2. Compute the subgradient of $g(\{\mu_a\})$ as $P - \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{B}_a \tilde{\mathbf{V}}_k \mathbf{Q}_k^\star \tilde{\mathbf{V}}_k^H)$, $a = 1, \cdots, A$, and update $\mu_a$'s
accordingly based on the ellipsoid method [29];

• Until all the $\mu_a$'s converge to a prescribed accuracy.

• Set $\mathbf{S}_k^\star = \tilde{\mathbf{V}}_k \mathbf{Q}_k^\star \tilde{\mathbf{V}}_k^H$, $k = 1, \cdots, K$.
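
The loop structure of Algorithm (A1) can be sketched as follows. Note that this sketch deliberately swaps the ellipsoid update for a simpler projected-subgradient step with a diminishing step size, and keeps every dual variable above a small positive floor so that the closed form (15) stays well defined. The step size, floor, iteration count, and toy channels are all assumptions of the example, and the loop is not guaranteed to have converged when it stops, so the result is rescaled to be feasible.

```python
import numpy as np

def null_basis(H_list, k):
    G = np.vstack([H for j, H in enumerate(H_list) if j != k])
    return np.linalg.svd(G, full_matrices=True)[2][G.shape[0]:].conj().T

def solve_p3(H_k, V_null, b_mu, w_k):
    """Closed-form solution (15) of (P3) for strictly positive b_mu."""
    vals, vecs = np.linalg.eigh(V_null.conj().T @ (b_mu[:, None] * V_null))
    C = (vecs / np.sqrt(vals)) @ vecs.conj().T
    _, sig, Vh = np.linalg.svd(H_k @ V_null @ C, full_matrices=False)
    lam = np.maximum(0.0, w_k - 1.0 / sig**2)
    return C @ (Vh.conj().T * lam) @ Vh @ C

def rate(S_list, H_list, w):
    """Weighted sum-rate objective of (P1), in nats."""
    return sum(w_k * np.linalg.slogdet(np.eye(H.shape[0]) + H @ S @ H.conj().T)[1]
               for S, H, w_k in zip(S_list, H_list, w))

def bd_per_bs(H_list, w, P, M_B, iters=500, step=0.5, mu_floor=1e-6):
    """Dual loop of (A1) with a projected-subgradient update (not the ellipsoid method)."""
    A = H_list[0].shape[1] // M_B
    V = [null_basis(H_list, k) for k in range(len(H_list))]
    mu = np.ones(A)
    for t in range(1, iters + 1):
        mu_used = mu.copy()
        b_mu = np.repeat(mu_used, M_B)
        S_list = [V_k @ solve_p3(H_k, V_k, b_mu, w_k) @ V_k.conj().T
                  for H_k, V_k, w_k in zip(H_list, V, w)]
        power = sum(np.diag(S).real for S in S_list).reshape(A, M_B).sum(axis=1)
        # P - power is a subgradient of g at mu_used; step against it, keep mu > 0.
        mu = np.maximum(mu_floor, mu - step / np.sqrt(t) * (P - power))
    return S_list, mu_used, power

rng = np.random.default_rng(0)
K, N, M_B, A, P = 2, 2, 2, 3, 1.0            # hypothetical sizes, M = 6
H_list = [(rng.standard_normal((N, M_B * A)) + 1j * rng.standard_normal((N, M_B * A)))
          / np.sqrt(2) for _ in range(K)]
w = [1.0, 1.0]
S_list, mu, power = bd_per_bs(H_list, w, P, M_B)
scale = min(1.0, P / power.max())            # uniform shrink so every BS meets its budget
S_feas = [scale * S for S in S_list]
dual_value = rate(S_list, H_list, w) - mu @ (power - P)   # g(mu): upper bound on (P1)
```

By weak duality, `dual_value` upper-bounds the rate of the feasible point `S_feas`; the gap between the two gives a rough indication of how far the simplified loop is from the optimum.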

From Theorem 3.1 and the fact that $\mathbf{S}_k = \mathbf{T}_k \mathbf{T}_k^H$, $\forall k$, we obtain the following corollary.
Corollary 3.1: The optimal BD precoding matrices to maximize the weighted sum-rate for the MIMO-BC subject
to the per-BS power constraints in (3) are given by

$$\mathbf{T}_k^\star = \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1/2} \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k^{1/2}, \quad k = 1, \cdots, K. \tag{17}$$

In the following remarks, we discuss some interesting observations on the optimal BD precoding matrices given
by (17).

Remark 3.3 (Channel Diagonalization): One desirable property of linear precoding for a point-to-point MIMO
channel is that the precoding matrix, when jointly deployed with a unitary decoding matrix at the receiver, is able
to diagonalize the MIMO channel into parallel scalar sub-channels, over which independent encoding and decoding
can be applied to simplify the transceiver design. Here, we verify that the optimal $\mathbf{T}_k^\star$ given in (17) satisfies this
"channel diagonalization" property, as follows:

$$
\begin{aligned}
\mathbf{H}_k \mathbf{T}_k^\star &= \mathbf{H}_k \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1/2} \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k^{1/2} && (18) \\
&= \hat{\mathbf{U}}_k \hat{\boldsymbol{\Sigma}}_k \hat{\mathbf{V}}_k^H \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k^{1/2} && (19) \\
&= \hat{\mathbf{U}}_k \hat{\boldsymbol{\Sigma}}_k \boldsymbol{\Lambda}_k^{1/2} && (20)
\end{aligned}
$$

where (19) is due to (13). Therefore, with a unitary decoding matrix $\hat{\mathbf{U}}_k^H$ applied at the $k$th MS receiver, the
MIMO channel for the $k$th MS with BD precoding is diagonalized into scalar sub-channels with channel gains
given by the main diagonal of the diagonal matrix $\hat{\boldsymbol{\Sigma}}_k \boldsymbol{\Lambda}_k^{1/2}$. It is easy to verify that the above linear precoder and
decoder processing preserves the single-user MIMO channel capacity.
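
The channel diagonalization property in (18)-(20) can be confirmed numerically. The following sketch uses assumed (not optimized) dual variables and randomly drawn toy channels: it forms the precoder of (17) for one MS and checks that applying the conjugate transpose of the left singular matrix from (13) at the receiver leaves a diagonal effective channel.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, M_B, A = 2, 2, 2, 3                    # hypothetical sizes, M = 6
M = M_B * A
H = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
     for _ in range(K)]
b_mu = np.repeat([0.01, 0.02, 0.05], M_B)    # assumed dual variables, one per BS
w_k, k = 1.0, 0

# Null-space basis of the stacked channels of the other MSs.
G = np.vstack([H[j] for j in range(K) if j != k])
V_null = np.linalg.svd(G, full_matrices=True)[2][G.shape[0]:].conj().T

# (V^H B_mu V)^(-1/2) via an eigendecomposition.
vals, vecs = np.linalg.eigh(V_null.conj().T @ (b_mu[:, None] * V_null))
C = (vecs / np.sqrt(vals)) @ vecs.conj().T

U_hat, sig, Vh_hat = np.linalg.svd(H[k] @ V_null @ C, full_matrices=False)  # SVD (13)
lam = np.maximum(0.0, w_k - 1.0 / sig**2)                                    # (14)
T = V_null @ C @ Vh_hat.conj().T @ np.diag(np.sqrt(lam))                     # precoder (17)

effective = U_hat.conj().T @ H[k] @ T        # channel seen after the unitary decoder
gains = sig * np.sqrt(lam)                   # predicted diagonal entries, as in (20)
```

The matrix `effective` should agree with `np.diag(gains)` up to rounding, so each data stream sees its own scalar sub-channel.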

Remark 3.4 (Comparison with Conventional Sum-Power Constraint): It is noted that (P1) can be modified to
deal with the case where a single sum-power constraint over all the BSs (instead of a set of per-BS power
constraints) is applied. This can be done via replacing the set of per-BS power constraints in (P1) by

$$\sum_{k=1}^{K} \mathrm{Tr}(\mathbf{S}_k) \leq P^{(\mathrm{sum})} \tag{21}$$

where $P^{(\mathrm{sum})}$ denotes the given sum-power constraint. Note that (P1) in this case corresponds to the conventional
BD precoder design problem for the MIMO BC with a sum-power constraint as studied in [7], [8], [9], [10]. It
can be shown that the developed solution for (P1) can be applied to this case, while the corresponding matrix $\mathbf{B}_\mu^\star$
should be modified as $\mu^\star \mathbf{I}$ with $\mu^\star$ denoting the optimal dual solution associated with the sum-power constraint
in (21). From (16), it follows that the optimal solution for this modified version of (P1) is given by

$$\mathbf{S}_k^{\star\star} = \frac{1}{\mu^\star} \tilde{\mathbf{V}}_k \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k \hat{\mathbf{V}}_k^H \tilde{\mathbf{V}}_k^H, \quad k = 1, \cdots, K. \tag{22}$$

Moreover, from (13) with $\mathbf{B}_\mu = \mu^\star \mathbf{I}$, it follows that $\hat{\mathbf{V}}_k$ is obtained from the SVD: $\frac{1}{\sqrt{\mu^\star}} \mathbf{H}_k \tilde{\mathbf{V}}_k = \hat{\mathbf{U}}_k \hat{\boldsymbol{\Sigma}}_k \hat{\mathbf{V}}_k^H$
and is thus independent of $\mu^\star$. Accordingly, the optimal precoding matrix in the sum-power constraint case is
$\mathbf{T}_k^{\star\star} = \frac{1}{\sqrt{\mu^\star}} \tilde{\mathbf{V}}_k \hat{\mathbf{V}}_k \boldsymbol{\Lambda}_k^{1/2}$. Comparing $\mathbf{T}_k^{\star\star}$ with $\mathbf{T}_k^\star$ in (17) for the per-BS power constraint case, we see that $\mathbf{T}_k^{\star\star}$
consists of orthogonal columns (beamforming vectors) since $\hat{\mathbf{V}}_k^H \tilde{\mathbf{V}}_k^H \tilde{\mathbf{V}}_k \hat{\mathbf{V}}_k = \mathbf{I}$, while $\mathbf{T}_k^\star$ in general consists
of non-orthogonal columns if $\mathbf{B}_\mu^\star$ is a non-identity diagonal matrix (i.e., the optimal $\mu_a^\star$'s are not all equal). This
is the very reason that the BD precoder designs in prior works [19], [20], [21] based on the orthogonal precoder
structure $\mathbf{T}_k^{\star\star}$ are in general suboptimal for the per-BS power constraint case.
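
The contrast drawn in Remark 3.4 can be illustrated numerically by evaluating the precoder of (17) twice: once with equal dual variables (the sum-power structure) and once with unequal ones (the per-BS structure). The sketch below is only an illustration; the dual variables are assumed values rather than optimal ones, the channels are random, and the helper names are hypothetical.

```python
import numpy as np

def bd_precoder(H_list, k, b_mu, w_k):
    """Precoder of (17) for MS k, given the diagonal b_mu of B_mu (assumed > 0)."""
    G = np.vstack([H for j, H in enumerate(H_list) if j != k])
    V_null = np.linalg.svd(G, full_matrices=True)[2][G.shape[0]:].conj().T
    vals, vecs = np.linalg.eigh(V_null.conj().T @ (b_mu[:, None] * V_null))
    C = (vecs / np.sqrt(vals)) @ vecs.conj().T
    _, sig, Vh_hat = np.linalg.svd(H_list[k] @ V_null @ C, full_matrices=False)
    lam = np.maximum(0.0, w_k - 1.0 / sig**2)
    return V_null @ C @ Vh_hat.conj().T * np.sqrt(lam)   # scales column i by sqrt(lam_i)

def off_diagonal_ratio(T):
    """Largest off-diagonal Gram entry relative to the largest diagonal one."""
    gram = T.conj().T @ T
    off = gram - np.diag(np.diag(gram))
    return np.abs(off).max() / np.abs(np.diag(gram)).max()

rng = np.random.default_rng(0)
K, N, M_B, A = 2, 2, 2, 3                    # hypothetical sizes, M = 6
M = M_B * A
H_list = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
          for _ in range(K)]

b_equal = np.full(M, 0.002)                              # B_mu = mu * I (sum-power case)
b_unequal = np.repeat([0.001, 0.002, 0.005], M_B)        # unequal per-BS dual variables

ratio_sum_power = off_diagonal_ratio(bd_precoder(H_list, 0, b_equal, 1.0))
ratio_per_bs = off_diagonal_ratio(bd_precoder(H_list, 0, b_unequal, 1.0))
```

With equal dual variables the Gram matrix of the precoder is diagonal up to rounding, whereas unequal dual variables generically leave non-zero off-diagonal entries, in line with the remark.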

B. Special Case: MISO BC with Per-Antenna Power Constraints 

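In this subsection, we investigate the developed solution for the special case of $M_B = N = 1$, where the
auxiliary MIMO BC with the per-BS power constraints reduces to an equivalent MISO BC with the corresponding
per-antenna power constraints, and the BD precoding reduces to the ZF-BF precoding. With $N = 1$, $\mathbf{H}_k$ is a
row vector, which we denote by $\mathbf{h}_k^H$ with $\mathbf{h}_k \in \mathbb{C}^{M \times 1}$, $k = 1, \cdots, K$. Accordingly, the SVD in (13) is rewritten as

$$\mathbf{h}_k^H \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu \tilde{\mathbf{V}}_k)^{-1/2} = \hat{\sigma}_k \hat{\mathbf{v}}_k^H \tag{23}$$

where $\hat{\sigma}_k > 0$ and $\hat{\mathbf{v}}_k \in \mathbb{C}^{(M-L) \times 1}$. From Theorem 3.1 and (23), it follows that the optimal downlink transmit
covariance matrix for the $k$th MS, $\mathbf{S}_k^\star$, in the case of $N = 1$ is expressed as

$$
\begin{aligned}
\mathbf{S}_k^\star &= \lambda_k \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1/2} \hat{\mathbf{v}}_k \hat{\mathbf{v}}_k^H (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1/2} \tilde{\mathbf{V}}_k^H && (24) \\
&= \lambda_k \hat{\sigma}_k^{-2} \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1} \tilde{\mathbf{V}}_k^H \mathbf{h}_k \mathbf{h}_k^H \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1} \tilde{\mathbf{V}}_k^H && (25)
\end{aligned}
$$

where $\lambda_k = (w_k - 1/\hat{\sigma}_k^2)^+$. It is thus easy to observe that in this case $\mathrm{Rank}(\mathbf{S}_k^\star) \leq 1$. Thus, the corresponding
optimal precoding matrix reduces to a (beamforming) vector denoted by $\mathbf{t}_k^\star \in \mathbb{C}^{M \times 1}$, where $\mathbf{S}_k^\star = \mathbf{t}_k^\star (\mathbf{t}_k^\star)^H$ and

$$\mathbf{t}_k^\star = \lambda_k^{1/2} \hat{\sigma}_k^{-1} \tilde{\mathbf{V}}_k (\tilde{\mathbf{V}}_k^H \mathbf{B}_\mu^\star \tilde{\mathbf{V}}_k)^{-1} \tilde{\mathbf{V}}_k^H \mathbf{h}_k. \tag{26}$$

Note that (26) holds regardless of $M_B$, but $M_B = 1$ corresponds to the per-antenna power constraint case for the
MISO BC. Furthermore, the optimal beamforming vector for the $k$th MS in the conventional sum-power constraint
case (with $\mathbf{B}_\mu^\star = \mu^\star \mathbf{I}$) is obtained from (26) as

$$\mathbf{t}_k^{\star\star} = \lambda_k^{1/2} \hat{\sigma}_k^{-1} (\mu^\star)^{-1} \tilde{\mathbf{V}}_k \tilde{\mathbf{V}}_k^H \mathbf{h}_k. \tag{27}$$

In the following, we discuss some interesting observations on the optimal ZF-BF precoding design in (26), as
compared with other prior results reported in [4], [24], [25].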

Remark 3.5: Denote $T = [t_1, \cdots, t_K] \in \mathbb{C}^{M \times K}$ as the precoding matrix for a MISO BC with $M$ transmitting
antennas and $K$ single-antenna receiving MSs. Then, for the sum-power constraint case with $t_k = t_k^{**}$ given in
(27), the corresponding optimal precoding matrix $T^{**}$ becomes the conventional ZF-BF design for the MISO BC
based on the channel pseudo inverse [4], i.e., $T^{**}$ can be put in the form $T^{**} = H^{\dagger} \tilde{\Lambda}$, where $H = [h_1, \cdots, h_K]^H$
and $\tilde{\Lambda} = \mathrm{Diag}(\tilde{\lambda}_1, \cdots, \tilde{\lambda}_K)$, with $\tilde{\lambda}_k = \lambda_k^{1/2} \hat{\sigma}_k$, $k = 1, \cdots, K$. However, it is observed that the ZF-BF design
based on the channel pseudo inverse is in general suboptimal for the MISO BC with the per-antenna/per-BS power
constraints, where the optimal precoding matrix $T^*$ is obtained with $t_k = t_k^*$ given in (26). Note that $t_k^*$ becomes
collinear with $t_k^{**}$ regardless of $\mu_a^*$'s when $M = K$. In this case, $\tilde{V}_k$ becomes a vector, $\tilde{v}_k \in \mathbb{C}^{M \times 1}$, and $t_k^*$ and $t_k^{**}$
can both be written in the form $p_k \tilde{v}_k$, with $p_k > 0$. Furthermore, it can be shown that this result holds regardless
of the value of $M_B$ provided that $N = 1$ and $M = M_B A = K$.
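The pseudo-inverse form in Remark 3.5 can be verified numerically. The sketch below is illustrative only: the value of `mu` stands in for the optimal sum-power dual variable, the channels are random, and the variable names are hypothetical. It builds each $t_k^{**}$ from (27) and compares the result with $H^{\dagger}$ scaled column-wise by $\lambda_k^{1/2} \hat{\sigma}_k$.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, mu = 5, 3, 0.1                 # mu: assumed dual variable of the sum-power constraint
h = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
Hmat = h.conj().T                    # K x M, row k is h_k^H
w = np.ones(K)                       # rate weights

T = np.zeros((M, K), dtype=complex)
scale = np.zeros(K)
for k in range(K):
    G = np.delete(Hmat, k, axis=0)                     # channels of the other MSs
    _, _, Vh = np.linalg.svd(G, full_matrices=True)
    Vt = Vh[K - 1:].conj().T                           # null-space basis, M x (M-K+1)
    coeff = Vt.conj().T @ h[:, k]
    sigma = np.linalg.norm(coeff) / np.sqrt(mu)        # from (23) with B_mu = mu * I
    lam = max(w[k] - 1.0 / sigma ** 2, 0.0)
    T[:, k] = np.sqrt(lam) / (sigma * mu) * (Vt @ coeff)   # beamformer (27)
    scale[k] = np.sqrt(lam) * sigma                    # diagonal entry of the scaling matrix

T_pinv = np.linalg.pinv(Hmat) @ np.diag(scale)
print("max deviation from pseudo-inverse form:", np.abs(T - T_pinv).max())
```

The agreement relies on $B_\mu$ being a scaled identity; with unequal per-antenna dual variables the two constructions differ in general when $M > K$.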

Remark 3.6: In [24], the authors proposed a ZF-BF precoding design for the MISO BC with per-antenna power
constraints in the form of the generalized inverse of $H$. The corresponding precoding matrix is expressed as

$$T = [g_1 a_1, \cdots, g_K a_K] + U^{\perp} [b_1, \cdots, b_K] \tag{28}$$

where $g_k$ is the normalized (to unit-norm) $k$th column in $H^{\dagger}$, $k = 1, \cdots, K$; $U^{\perp} \in \mathbb{C}^{M \times (M-K)}$ is a projection
matrix onto the orthogonal complement of the space spanned by the row vectors in $H$, with $(U^{\perp})^H U^{\perp} = I$; $a_k$'s
and $b_k$'s are design variables, $k = 1, \cdots, K$. In other words, each beamforming vector $t_k$ in $T$ given by (28)
is a linear combination of $g_k$ and the columns in $U^{\perp}$. We see that the beamforming vectors given in (28) are
in accordance with the optimal $t_k^*$'s given in (26) due to the fact that for the MISO BC with $N = 1$ and thus
$M - L = M - N(K - 1) = M - K + 1$, the space spanned by the columns in $\tilde{V}_k \in \mathbb{C}^{M \times (M-L)}$ is the same as that
spanned by $g_k$ and the columns in $U^{\perp}$. Note that in [24], an algorithm is proposed to obtain the ZF-BF precoding
matrix for the MISO BC with per-antenna power constraints by numerically searching over $a_k$'s and $b_k$'s in (28).
In this paper, the optimal ZF-BF precoders are found based on the closed-form expression in (26) and applying a
numerical search over the dual variables $\mu_a$'s by the ellipsoid method.

Remark 3.7: It is also worth comparing the proposed method for solving (P1) in the MISO BC case with the
method presented in [25]. For the method in [25], a set of transmit beamforming vectors, $t_1, \cdots, t_K$, are used in a
MISO BC. Thus, the weighted sum-rate maximization problem with the ZF-BF precoding and per-antenna power
constraints can be formulated as

$$\begin{aligned}
(\mathrm{P4}):\ \max_{t_1, \cdots, t_K}\ & \sum_{k=1}^{K} w_k \log\left(1 + \|h_k^H t_k\|^2\right) \\
\text{s.t.}\ & h_j^H t_k = 0,\ \forall j \neq k \\
& \sum_{k=1}^{K} \mathrm{Tr}\left(B_i t_k t_k^H\right) \le P,\ \forall i
\end{aligned}$$

where $B_i \in \mathbb{C}^{M \times M}$ denotes a diagonal matrix with the $i$th diagonal element equal to one and all the others equal
to zero, $i = 1, \cdots, M$; and $P$ refers to the per-antenna power constraint. Note that (P4) is non-convex due to the
fact that the objective function is not necessarily concave over $t_k$'s. In [25], it is proposed to convert (P4) into an
equivalent problem in terms of $S_k = t_k t_k^H$, $k = 1, \cdots, K$, which is expressed as

$$\begin{aligned}
(\mathrm{P5}):\ \max_{S_1, \cdots, S_K}\ & \sum_{k=1}^{K} w_k \log\left(1 + h_k^H S_k h_k\right) \\
\text{s.t.}\ & h_j^H S_k h_j = 0,\ \forall j \neq k \\
& \sum_{k=1}^{K} \mathrm{Tr}\left(B_i S_k\right) \le P,\ \forall i \\
& S_k \succeq 0,\ \forall k \\
& \mathrm{Rank}(S_k) = 1,\ \forall k.
\end{aligned}$$

Note that (P5) can be treated as (P1) in the case of $N = 1$ and $M_B = 1$ (thus $M = M_B A = A$), and with
an additional set of rank-one constraints for $S_k$'s. However, these rank-one constraints are non-convex and thus
render (P5) non-convex in general. As a special form of (P1), (P5) without the rank-one constraints is convex, and
thus can be solved efficiently by, e.g., the interior-point method [28]. However, the obtained solution for $S_k$ is not
guaranteed to be rank-one. In [25], it is proved that there always exists a solution that consists of a set of rank-one
$S_k$'s for (P5), and a method is provided to construct the rank-one solution from the corresponding solution (with
rank greater than one) of (P5) without the rank-one constraints. In contrast, the proposed method in this paper
obtains the closed-form solution for (P5), which, as given in (25), is guaranteed to be rank-one.
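The rank-one property of the closed-form solution can be illustrated with a short numerical check. This is a sketch under stated assumptions: the per-antenna dual variables `mu` are drawn at random rather than optimized, and the function name `covariance` is hypothetical. It evaluates the covariance in the form (24), the beamformer in the form (26), and confirms that the two agree and that the ZF constraints hold.

```python
import numpy as np

rng = np.random.default_rng(2)
M, K = 6, 3
mu = rng.uniform(0.05, 0.5, size=M)      # assumed (not optimized) per-antenna dual variables
B_mu = np.diag(mu)
h = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
Hmat = h.conj().T                        # K x M, row k is h_k^H

def covariance(k, w_k=1.0):
    """Return S_k in the form (24) and t_k in the form (26) for the given mu."""
    G = np.delete(Hmat, k, axis=0)
    _, _, Vh = np.linalg.svd(G, full_matrices=True)
    Vt = Vh[K - 1:].conj().T             # null-space basis, M x (M-K+1)
    X = Vt.conj().T @ B_mu @ Vt
    ev, E = np.linalg.eigh(X)
    W = (E * ev ** -0.5) @ E.conj().T    # X^{-1/2}
    row = h[:, k].conj() @ Vt @ W        # left-hand side of (23)
    sigma = np.linalg.norm(row)
    v_hat = row.conj() / sigma           # because row = sigma * v_hat^H
    lam = max(w_k - 1.0 / sigma ** 2, 0.0)
    S24 = lam * Vt @ W @ np.outer(v_hat, v_hat.conj()) @ W @ Vt.conj().T
    t = np.sqrt(lam) / sigma * Vt @ np.linalg.solve(X, Vt.conj().T @ h[:, k])
    return S24, t

S0, t0 = covariance(0)
print("rank of S_0:", np.linalg.matrix_rank(S0))
print("leakage to other MSs:", np.abs(np.delete(Hmat, 0, axis=0) @ t0).max())
```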



IV. Suboptimal Solution 

In this section, we propose a suboptimal solution for (P1), which can be obtained with less complexity than
the optimal solution. First, we define the projected channel of $H_k$ associated with the projection matrix $P_k$ as
$H_k^{\perp} = H_k P_k = H_k \tilde{V}_k \tilde{V}_k^H$, $k = 1, \cdots, K$, where $H_k^{\perp} \in \mathbb{C}^{N \times M}$ and $\mathrm{Rank}(H_k^{\perp}) = \min(N, M - L) = N$.
Next, define the (reduced) SVD of $H_k^{\perp}$ as

$$H_k^{\perp} = U_k^{\perp} \Sigma_k^{\perp} (V_k^{\perp})^H \tag{29}$$

where $U_k^{\perp} \in \mathbb{C}^{N \times N}$, $\Sigma_k^{\perp} = \mathrm{Diag}(\sigma_{k,1}^{\perp}, \cdots, \sigma_{k,N}^{\perp})$, and $V_k^{\perp} \in \mathbb{C}^{M \times N}$. Then, the proposed suboptimal solution for
(P1) is given by

$$S_k = V_k^{\perp} \Lambda_k (V_k^{\perp})^H \tag{30}$$

where $\Lambda_k = \mathrm{Diag}(\lambda_{k,1}, \cdots, \lambda_{k,N})$ denotes the power allocation for the $k$th MS. It is worth noting that the above
solution for $S_k$ is in general suboptimal for (P1) by comparing it with the optimal solution in (16). Also note
that (30) is optimal for the sum-power constraint case as discussed in Section III-A, since it can be shown that in
(22) $\tilde{V}_k \hat{V}_k = V_k^{\perp}$ with $B_\mu^* = \mu^* I$. With $S_k$'s given in (30), it can be shown that the ZF constraints in (P1) are
satisfied and thus can be removed; furthermore, in the objective function of (P1), the following equalities hold:

$$\log\left|I + H_k S_k H_k^H\right| \tag{31}$$
$$= \log\left|I + (H_k^{\perp} + H_k V_k V_k^H) S_k (H_k^{\perp} + H_k V_k V_k^H)^H\right| \tag{32}$$
$$= \log\left|I + H_k^{\perp} S_k (H_k^{\perp})^H\right| \tag{33}$$
$$= \log\left|I + U_k^{\perp} \Sigma_k^{\perp} \Lambda_k \Sigma_k^{\perp} (U_k^{\perp})^H\right| \tag{34}$$
$$= \log\left|I + (\Sigma_k^{\perp})^2 \Lambda_k\right| \tag{35}$$

where (32) is due to the fact that $\tilde{V}_k \tilde{V}_k^H + V_k V_k^H = I$; (33) is due to the fact that $S_k V_k = 0$ since $\tilde{V}_k^H V_k = 0$;
(34) is due to (29) and (30); and (35) is due to the fact that $\log|I + XY| = \log|I + YX|$. From (35), we see
that the MIMO channel for the $k$th MS is diagonalized into $N$ scalar sub-channels with channel gains given by
$\sigma_{k,i}^{\perp}$, $i = 1, \cdots, N$. Accordingly, (P1) is reduced to the following problem

$$\begin{aligned}
(\mathrm{P6}):\ \max_{\{\lambda_{k,i}\}}\ & \sum_{k=1}^{K} w_k \sum_{i=1}^{N} \log\left(1 + (\sigma_{k,i}^{\perp})^2 \lambda_{k,i}\right) \\
\text{s.t.}\ & \sum_{k=1}^{K} \sum_{i=1}^{N} \|v_k^{\perp}[a,i]\|^2 \lambda_{k,i} \le P,\ \forall a \\
& \lambda_{k,i} \ge 0,\ \forall k, i
\end{aligned}$$

where $\{\lambda_{k,i}\}$ denotes the set of $\lambda_{k,i}$'s, $k = 1, \cdots, K$ and $i = 1, \cdots, N$, while $v_k^{\perp}[a,i]$ denotes the vector consisting
of the elements from the $i$th column and the $((a-1)M_B + 1)$-th to $(aM_B)$-th rows in $V_k^{\perp}$, $a = 1, \cdots, A$ and
$i = 1, \cdots, N$. It can be verified that (P6) is a convex optimization problem. Thus, similar to (P2), the Lagrange
duality method can be applied to solve (P6) by introducing a set of dual variables, $\mu_a$, $a = 1, \cdots, A$, associated
with the set of per-BS power constraints in (P6). For brevity, we omit here the details for the derivation and present
the optimal solution (power allocation) for $\{\lambda_{k,i}\}$ as follows:

$$\lambda_{k,i} = \left(\frac{w_k}{\sum_{a=1}^{A} \mu_a \|v_k^{\perp}[a,i]\|^2} - \frac{1}{(\sigma_{k,i}^{\perp})^2}\right)^{+}. \tag{36}$$
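The diagonalization in (31)-(35) and the power allocation rule (36) can be sketched in a few lines of NumPy. The example below assumes random channels and fixed, non-optimized dual variables `mu`; the function names `projected_svd` and `power_allocation` are hypothetical. It evaluates (36) for one MS and compares $\log|I + H_k S_k H_k^H|$ with the sum of the scalar sub-channel rates.

```python
import numpy as np

rng = np.random.default_rng(3)
A, M_B, K, N = 4, 2, 2, 2
M, L = A * M_B, N * (K - 1)
H = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
     for _ in range(K)]
w = np.ones(K)
mu = np.array([0.1, 0.2, 0.3, 0.4])      # assumed dual variables, not optimized

def projected_svd(k):
    """Reduced SVD (29) of the projected channel H_k P_k."""
    G = np.vstack([H[j] for j in range(K) if j != k])
    _, _, Vh = np.linalg.svd(G, full_matrices=True)
    Vt = Vh[L:].conj().T                 # null-space basis, M x (M-L)
    _, s, Vh_perp = np.linalg.svd(H[k] @ Vt @ Vt.conj().T, full_matrices=False)
    return s, Vh_perp.conj().T           # singular values (N,), V_perp (M x N)

def power_allocation(k):
    """Water-filling rule (36) for fixed mu."""
    s, V = projected_svd(k)
    # row_power[a, i] = squared norm of the rows of column i that belong to BS a
    row_power = (np.abs(V) ** 2).reshape(A, M_B, N).sum(axis=1)
    price = mu @ row_power               # effective price of each stream, shape (N,)
    lam = np.maximum(w[k] / price - 1.0 / s ** 2, 0.0)
    return lam, s, V

lam, s, V = power_allocation(0)
S0 = (V * lam) @ V.conj().T              # covariance (30)
lhs = np.linalg.slogdet(np.eye(N) + H[0] @ S0 @ H[0].conj().T)[1]
rhs = np.log1p(s ** 2 * lam).sum()
print("log-det:", lhs, " sum of scalar rates:", rhs)
```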
Similar to (A1), the following algorithm can be used to obtain the proposed suboptimal solution for (P1).

Algorithm (A2):

• Initialize $\mu_a \ge 0$, $a = 1, \cdots, A$.

• Compute the SVDs: $H_k \tilde{V}_k \tilde{V}_k^H = U_k^{\perp} \Sigma_k^{\perp} (V_k^{\perp})^H$, $k = 1, \cdots, K$.

• Repeat

1. Solve $\{\lambda_{k,i}\}$ using (36) with the given $\mu_a$'s;

2. Compute the subgradient of the dual function for (P6) as $P - \sum_{k=1}^{K} \sum_{i=1}^{N} \|v_k^{\perp}[a,i]\|^2 \lambda_{k,i}$, $a = 1, \cdots, A$,
and update $\mu_a$'s accordingly based on the ellipsoid method [29];

• Until all the $\mu_a$'s converge to a prescribed accuracy.

• Set $S_k = V_k^{\perp} \Lambda_k (V_k^{\perp})^H$, $k = 1, \cdots, K$.
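The steps of (A2) can be sketched as follows. Note the assumptions: the dual update below is a normalized projected-subgradient step, swapped in for the ellipsoid method purely to keep the example short; the step-size rule, the iteration count, the small positive floor on the dual variables, and the final rescaling to a feasible point are all illustrative choices rather than part of the algorithm as stated.

```python
import numpy as np

rng = np.random.default_rng(4)
A, M_B, K, N, P = 4, 1, 2, 2, 10.0
M, L = A * M_B, N * (K - 1)
H = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
     for _ in range(K)]
w = np.ones(K)

# SVDs of the projected channels (second step of (A2)).
sig, Vp, row_pow = [], [], []
for k in range(K):
    G = np.vstack([H[j] for j in range(K) if j != k])
    _, _, Vh = np.linalg.svd(G, full_matrices=True)
    Vt = Vh[L:].conj().T
    _, s, Vh_perp = np.linalg.svd(H[k] @ Vt @ Vt.conj().T, full_matrices=False)
    sig.append(s)
    Vp.append(Vh_perp.conj().T)
    # row_pow[k][a, i] = squared norm of the BS-a rows of column i of V_perp
    row_pow.append((np.abs(Vp[k]) ** 2).reshape(A, M_B, N).sum(axis=1))

def allocate(mu):
    """Power allocation (36) for all MSs at fixed mu."""
    return [np.maximum(w[k] / (mu @ row_pow[k]) - 1.0 / sig[k] ** 2, 0.0)
            for k in range(K)]

def bs_power(lam):
    """Transmit power of each BS, a length-A vector."""
    return sum(row_pow[k] @ lam[k] for k in range(K))

def sum_rate(lam):
    return sum(w[k] * np.log1p(sig[k] ** 2 * lam[k]).sum() for k in range(K))

mu = np.full(A, 0.2)
for it in range(1, 3001):
    lam = allocate(mu)
    g = P - bs_power(lam)                          # subgradient of the dual function
    step = 0.05 / np.sqrt(it)                      # assumed diminishing step size
    mu = np.maximum(mu - step * g / (np.linalg.norm(g) + 1e-12), 1e-6)

lam = allocate(mu)
dual_value = sum_rate(lam) + mu @ (P - bs_power(lam))
scale = min(1.0, P / max(bs_power(lam).max(), 1e-12))
lam = [scale * x for x in lam]                     # rescale to a feasible point
S = [(Vp[k] * lam[k]) @ Vp[k].conj().T for k in range(K)]   # last step of (A2)
print("sum-rate:", sum_rate(lam), " dual bound:", dual_value)
print("per-BS powers:", bs_power(lam))
```

By weak duality the dual value upper-bounds the achieved sum-rate for any nonnegative dual variables, so the gap between the two printed numbers gives a rough indication of how far the dual search has progressed.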

As compared with (A1), (A2) has a lower complexity due to the fact that for each loop in the "Repeat", only
the power allocation computation in (36) is implemented, instead of the precoding matrix computation given in
(15). Due to the suboptimal structure of the downlink transmit covariance matrix in (30) for (A2) as compared to
the optimal one in (16) for (A1), (A2) in general leads to a suboptimal solution for (P1), except in the special case
of $N = 1$ and $M = K$ where the transmit covariance structure in (30) is known to be optimal (see Remark 3.5).
In this special case, (A2) can be used as an alternative algorithm to (A1) to obtain the optimal solution for (P1).

Remark 4.1: It is worth noting that (A2) can be shown equivalent to the algorithm proposed in [19] for the
special case of $M_B = N = 1$, i.e., the MISO BC with the ZF-BF precoding and the per-antenna power constraints.
In this case, similar to Remark 3.5, the proposed suboptimal solution in (30) corresponds to a precoding matrix
in the form $T = H^{\dagger} \Theta$, where $H$ denotes the downlink MISO-BC channel, and $\Theta$ is a diagonal matrix with
the main diagonal that has been optimized in a similar way as we have used for (P6). According to our previous
discussions, this algorithm is indeed suboptimal for (P1) if $M > K$.

At last, as a counterpart of Lemma 3.2, we have the following lemma.

Lemma 4.1: Let $A^*$ denote the number of active per-BS power constraints with the optimal solution for (P6).
It then holds that $A^* \le NK$.

Proof: Please refer to Appendix C. ∎

Lemma 4.1 provides an upper bound on the number of active per-BS power constraints for the suboptimal solution
of (P1) obtained by (A2). It follows that in the case of $A/(NK) \gg 1$, most of the BSs in the cooperative multi-cell
system cannot transmit with their full power levels with the suboptimal BD precoder design obtained by (A2).

V. Numerical Examples 

In this section, we provide numerical examples to illustrate the results in this paper. For the purpose of exposition,
we assume that the channels $H_k$'s are independent over $k$, and all the elements in each channel matrix are
independent CSCG random variables with zero mean and unit variance. Moreover, we consider the sum-rate
maximization for the cooperative multi-cell downlink transmission, i.e., $w_k$'s are all equal to one in (P1). The
obtained numerical results along with related discussions are presented in the following subsections.

A. Convergence Behavior 

In Fig. 1, we show the convergence behavior of Algorithm (A1) for solving (P1). It is assumed that $A = 2$,
$M_B = 4$, $K = 4$, and $N = 2$. The transmit power constraint $P$ for each of the two BSs is set equal to 10. The
initial values assigned to $\mu_a$'s in (A1) are $\mu_1 = \mu_2 = 0.2$. The achievable sum-rate and the consumed transmit
powers by the two BSs are shown for different iterations in (A1), each with a pair of updated values for $\mu_1$ and
$\mu_2$. As observed, the plotted rate and power values all converge to fixed values after around 30 iterations. The
converged transmit powers for the two BSs are both observed to be equal to their given constraint value, which is 10.
A similar convergence behavior is observed for Algorithm (A2) and is thus omitted here. Generally speaking,
the convergence speed of both (A1) and (A2) depends critically on the total number of per-BS power constraints,
$A$, which is also the number of dual variables $\mu_a$'s to be searched. With the ellipsoid method, it is known that the
complexity for searching $\mu_a$'s in (A1) or (A2) is $O(A^2)$ for large values of $A$ [29]. Thus, the number of iterations
for the algorithm convergence grows asymptotically in the order of the square of the number of BSs in the system.

B. MISO BC with Per-Antenna Power Constraints 

Next, we consider a special case of the cooperative multi-cell downlink transmission with $M_B = N = 1$, which
is equivalent to a MISO BC with the corresponding per-antenna power constraints. The per-BS/per-antenna power
constraint is assumed to be $P = 10$. In Fig. 2, we compare the achievable sum-rate with the optimal ZF-BF
precoder obtained by (A1) against that with the suboptimal precoder obtained by (A2). The number of MSs is
fixed as $K = 2$, while the total number of transmitting antennas $M$ ranges from 2 to 10. It is observed that
when $M = K = 2$, the achievable rates for both the optimal and suboptimal precoders are identical, which is
in accordance with our discussions in Section IV. It is also observed that when $M > K$, the sum-rate gain of
the optimal precoder solution over the suboptimal solution increases with $M$. In order to explain this observation,
in Fig. 3 we show the histograms for the number of active per-antenna power constraints with the optimal and
suboptimal solutions over 100 random MISO-BC realizations for the case of $M = 8$. It is observed that the number
of active per-antenna power constraints with the optimal solution is always no less than $M - K + 1 = 7$, while
that with the suboptimal solution is always no larger than $NK = 2$, in accordance with Lemmas 3.2 and 4.1,
respectively. We thus see that when $M$ becomes much larger than $K$ for the MISO BC, the optimal ZF-BF design
can utilize the full transmit powers from at least $(M - K + 1)$ antennas, while the suboptimal design can only
have at most $K$ antennas transmitting with their full powers. This explains why in Fig. 2 the rate gap between the
optimal and suboptimal ZF-BF designs enlarges as $M$ increases with a fixed $K$.

C. MIMO BC with Per-Antenna Power Constraints

Last, we consider the case of multi-antenna MS receivers. For the corresponding auxiliary MIMO BC, we assume
that $A = 4$, $M_B = 1$, $K = 2$, and $N = 2$. Note that in this case although $M = NK$, i.e., the total number of
transmitting antennas is equal to that of receiving antennas, (A2) in general leads to a suboptimal solution for
(P1) due to the fact that $N > 1$. In Fig. 4, we show the achievable sum-rates for both the optimal and suboptimal
BD precoders vs. the per-BS/per-antenna transmit power constraint $P$. It is observed that although the optimal
precoder solution still performs better than the suboptimal one, their rate gap is not as large as that in Fig. 2 when
$M > NK$ and $P = 10$. This is due to the fact that in the case of $M = NK$, although the maximum number
of antennas transmitting with full powers for the suboptimal solution is still limited by $NK$ according to Lemma
4.1, such a constraint is not restrictive since $M_B = 1$ and $A = NK$. The practical rule of thumb here is that when
$M_B = 1$ and $A$ is not substantially larger than $NK$, the low-complexity suboptimal BD precoder obtained by (A2)
can be applied to achieve the sum-rate performance close to that of the optimal BD precoder obtained by (A1).

VI. Conclusion 

This paper studies the design of block diagonalization (BD) linear precoder for the fully cooperative multi-cell 
downlink transmission subject to individual power constraints for the base stations (BSs). By applying convex 
optimization techniques, this paper derives the closed-form expression for the optimal BD precoding matrix to 
maximize the weighted sum-rate of all users in the multi-cell system. The optimal BD precoding vectors for 
each user are shown to be in general non-orthogonal, which differs from the conventional orthogonal precoder 
design for the sum-power constraint case. A suboptimal heuristic method is also proposed, which combines 
the conventional orthogonal BD precoder design with an optimized power allocation to meet the per-BS power 
constraints. Furthermore, this paper shows that the proposed optimal BD precoder solution provides the optimal 
zero-forcing beamforming (ZF-BF) solution for the special case of MISO BC with per-antenna power constraints. 
The results in this paper are readily extended to obtain the optimal BD precoders for the MIMO-BC with general 
linear transmit power constraints, which include per-antenna/per-BS power constraints as special cases. 

Appendix A 
Proof of Lemma 3.1

Let $\{S_1^*, \cdots, S_K^*\}$ denote the optimal solution of (P1). Without loss of generality, for any given $k \in \{1, \cdots, K\}$,
we can express $S_k^*$ in the following form

$$S_k^* = [\tilde{V}_k, V_k] \begin{bmatrix} A & B \\ B^H & C \end{bmatrix} [\tilde{V}_k, V_k]^H \tag{37}$$
$$\phantom{S_k^*} = \tilde{V}_k A \tilde{V}_k^H + \tilde{V}_k B V_k^H + V_k B^H \tilde{V}_k^H + V_k C V_k^H \tag{38}$$

where $A \in \mathbb{C}^{(M-L) \times (M-L)}$, $B \in \mathbb{C}^{(M-L) \times L}$, and $C \in \mathbb{C}^{L \times L}$. Note that $A = A^H$ and $C = C^H$. Since $S_k^*$ must
satisfy the set of ZF constraints in (P1), it follows that

$$V_k^H S_k^* V_k = 0. \tag{39}$$

From (38) and (39), it follows that $C = 0$. Furthermore, from the theory of Schur complement [28], it is known
that $S_k^* \succeq 0$ if and only if (iff) the following conditions are satisfied:

$$A \succeq 0 \tag{40}$$
$$(I - A A^{\dagger}) B = 0 \tag{41}$$
$$C - B^H A^{\dagger} B \succeq 0. \tag{42}$$

Since $A \succeq 0$, it follows that $B^H A^{\dagger} B \succeq 0$. Using this fact together with $C = 0$, from (42) it follows that
$B^H A^{\dagger} B = 0$. Thus, from (41) it follows that $B = 0$. With $B = 0$ and $C = 0$, from (38) it follows that
$S_k^* = \tilde{V}_k A \tilde{V}_k^H$. By letting $A = Q_k$, the proof of Lemma 3.1 thus follows.
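The Schur-complement step of this proof can be illustrated numerically. The sketch below uses small random blocks as stand-ins for $A$, $B$, and $C$ (the names `A_blk`, `B_blk`, `C_blk`, and `min_eig` are hypothetical): with $C = 0$ and a nonzero $B$, the block matrix has a strictly negative eigenvalue, so positive semidefiniteness forces $B = 0$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 3, 2                          # stand-ins for M - L and L
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A_blk = X @ X.conj().T               # Hermitian positive definite block
B_blk = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
C_blk = np.zeros((m, m))             # C = 0, as implied by the ZF constraints

def min_eig(B):
    """Smallest eigenvalue of the block matrix [[A, B], [B^H, C]]."""
    S = np.block([[A_blk, B], [B.conj().T, C_blk]])
    return np.linalg.eigvalsh(S).min()

print("nonzero B:", min_eig(B_blk))                  # negative, so not PSD
print("zero B   :", min_eig(np.zeros((n, m))))       # zero up to rounding, so PSD
```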

Appendix B 
Proof of Lemma 3.2

We prove Lemma 3.2 by contradiction. Suppose that there exist a number of strictly positive $\mu_a$'s such that
$A_\mu < \left\lceil \frac{M - N(K-1)}{M_B} \right\rceil$. Then, it follows that $A_\mu < \frac{M - N(K-1)}{M_B}$. Since $L = N(K-1)$, it thus follows that
$M_B A_\mu < (M - L)$. Let $\mathcal{S}$ denote the set consisting of the indices corresponding to all the non-zero diagonal elements in
$B_\mu$, i.e., if $\mu_a > 0$ for any $a \in \{1, \cdots, A\}$, then $(a-1)M_B + i \in \mathcal{S}$, $i = 1, \cdots, M_B$. Note that the size of $\mathcal{S}$ is
$|\mathcal{S}| = M_B A_\mu$. Let $E_k(\mathcal{S})$ and $F_k(\mathcal{S}^c)$ denote the matrices consisting of the rows in $\tilde{V}_k \in \mathbb{C}^{M \times (M-L)}$
with the row indices given by the elements in $\mathcal{S}$ and $\mathcal{S}^c$, respectively, where $\mathcal{S}^c$ denotes the complement of $\mathcal{S}$.
Note that $|\mathcal{S}| + |\mathcal{S}^c| = M$ and $|\mathcal{S}^c| > 0$ since $M_B A_\mu < (M - L) < M$. From $E_k(\mathcal{S}) \in \mathbb{C}^{M_B A_\mu \times (M-L)}$ and
$M_B A_\mu < (M - L)$, it follows that $E_k(\mathcal{S})$ is not full column-rank. Thus, we could find a vector $q_k \in \mathbb{C}^{(M-L) \times 1}$
with $\|q_k\| = 1$ such that $E_k(\mathcal{S}) q_k = 0$ and $F_k(\mathcal{S}^c) q_k \neq 0$. Accordingly, we have $B_\mu \tilde{V}_k q_k = 0$ and $\tilde{V}_k q_k \neq 0$.
Denote $\boldsymbol{w}_k = \tilde{V}_k q_k$ (a vector, not to be confused with the rate weight $w_k$). Note that the indices of the non-zero
elements in $\boldsymbol{w}_k$ belong to $\mathcal{S}^c$. Suppose that the solution of (P3) is taken as $Q_k = p\,(q_k q_k^H)$ with $p > 0$. Substituting
this solution into the objective function of (P3) yields

$$w_k \log\left|I + H_k \tilde{V}_k Q_k \tilde{V}_k^H H_k^H\right| - \mathrm{Tr}\left(B_\mu \tilde{V}_k Q_k \tilde{V}_k^H\right) \tag{43}$$
$$= w_k \log\left|I + p H_k \boldsymbol{w}_k \boldsymbol{w}_k^H H_k^H\right|. \tag{44}$$

Let $R_k = H_k \boldsymbol{w}_k$. Then, (44) can be further expressed as $w_k \log|I + p R_k R_k^H|$, whose value becomes unbounded
as $p \to \infty$ provided that $R_k R_k^H \neq 0$ (which holds with probability one due to independent channel realizations).
Therefore, we conclude that the presumption that $A_\mu < \left\lceil \frac{M - N(K-1)}{M_B} \right\rceil$ cannot be true. Lemma 3.2 thus follows.
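The construction used in this proof can be reproduced numerically for a small MISO example. The sketch below assumes $N = M_B = 1$, a random channel, and a hand-picked dual vector `mu` with only five positive entries (fewer than $M - L = 7$); the names `w_vec` and `objective` are hypothetical. It finds a unit vector in the null space of the selected rows and shows that the objective in (43) keeps growing with $p$.

```python
import numpy as np

rng = np.random.default_rng(6)
M, K = 8, 2                                   # N = 1 and M_B = 1, so L = K - 1 = 1
Hmat = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
mu = np.array([0.3, 0.1, 0.2, 0.4, 0.5, 0.0, 0.0, 0.0])   # only 5 < M - L positive duals
B_mu = np.diag(mu)

k = 0
G = np.delete(Hmat, k, axis=0)                # channel of the other MS
_, _, Vh = np.linalg.svd(G, full_matrices=True)
Vt = Vh[K - 1:].conj().T                      # null-space basis, M x (M - L)
E = Vt[mu > 0]                                # rows indexed by the set S
_, _, Eh = np.linalg.svd(E, full_matrices=True)
q = Eh[-1].conj()                             # unit vector with E q = 0
w_vec = Vt @ q                                # supported on the complement of S

def objective(p):
    """Objective (43) for Q_k = p * q q^H with unit rate weight."""
    Q = p * np.outer(q, q.conj())
    gain = (Hmat[k] @ Vt @ Q @ Vt.conj().T @ Hmat[k].conj()).real
    penalty = np.trace(B_mu @ Vt @ Q @ Vt.conj().T).real
    return np.log1p(gain) - penalty

for p in (1.0, 1e3, 1e6):
    print(p, objective(p))
```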

Appendix C 
Proof of Lemma 4.1

We prove Lemma 4.1 by contradiction. Suppose that $A^* \ge (NK + 1)$. Let $\mathcal{B}$ be a subset of $\{1, \cdots, A\}$ consisting
of the indices of the BSs for which the transmit power constraints are tight with the optimal solution for (P6).
Note that $|\mathcal{B}| = A^*$. Let $\lambda_{k,i}^*$ denote the optimal solution for (P6), $k = 1, \cdots, K$ and $i = 1, \cdots, N$. Thus, we have
the following equalities from (P6)

$$\sum_{k=1}^{K} \sum_{i=1}^{N} \|v_k^{\perp}[a,i]\|^2 \lambda_{k,i}^* = P, \quad \forall a \in \mathcal{B}. \tag{45}$$

Accordingly, $\lambda_{k,i}^*$'s are the solutions for a set of $A^*$ linearly independent (which holds with probability one due to
independent channel realizations) equations. However, since $A^* \ge (NK + 1)$, we see that the number of equations
exceeds that of unknowns, which is equal to $NK$. Thus, given $P > 0$, there exist no feasible solutions for $\lambda_{k,i}^*$'s.
We thus conclude that the presumption that $A^* \ge (NK + 1)$ cannot be true. Lemma 4.1 thus follows.
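The counting argument can be illustrated with a generic overdetermined system. In the sketch below, the coefficient matrix is filled with random nonnegative numbers as stand-ins for the $\|v_k^{\perp}[a,i]\|^2$ terms in (45) (an assumption made for illustration; they are not derived from channels here). With $NK + 1$ equations and $NK$ unknowns, the augmented matrix has higher rank than the coefficient matrix, so no solution exists.

```python
import numpy as np

rng = np.random.default_rng(7)
NK, P = 4, 10.0
coeff = rng.uniform(size=(NK + 1, NK))        # stand-ins for the squared norms in (45)
rhs = np.full(NK + 1, P)                      # every tight constraint equals P

rank_coeff = np.linalg.matrix_rank(coeff)
rank_aug = np.linalg.matrix_rank(np.column_stack([coeff, rhs]))
print("rank of coefficients:", rank_coeff, " rank of augmented system:", rank_aug)
```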
Fig. 1. Convergence behavior of Algorithm (A1).
Fig. 2. Comparison of the sum-rate in the MISO BC with the ZF-BF precoding and the per-antenna power constraints.
Fig. 4. Comparison of the sum-rate in the MIMO BC with the BD precoding and the per-antenna/per-BS power constraints.



